kACTUS 2: Privacy Preserving in Classification Tasks Using k-Anonymity

Authors

  • Slava Kisilevich
  • Yuval Elovici
  • Bracha Shapira
  • Lior Rokach
Abstract

k-anonymity is a method for masking sensitive data that addresses the problem of re-linking released data with an external source and makes it difficult to re-identify an individual. k-anonymity operates on a set of quasi-identifiers (public sensitive attributes) whose availability in, and linkage from, external datasets is anticipated, and requires that the released dataset contain at least k records for every possible quasi-identifier value. Another aspect of k-anonymity is its ability to maintain the truthfulness of the released data, unlike other existing methods. This is achieved by generalization, a primary technique in k-anonymity. Generalization consists of substituting attribute values with semantically consistent but less precise values. When the substituted value does not preserve semantic validity, the technique is called suppression, which is a special case of generalization. We present a hybrid approach called compensation, which is based on suppression and swapping for achieving privacy. Since swapping decreases the truthfulness of attribute values, our algorithm incorporates a tradeoff between the level of swapping (information truthfulness) and suppression (information loss). We use k-anonymity to explore the issue of anonymity preservation. Since we do not use generalization, we do not need a priori knowledge of attribute semantics. We investigate data anonymization in the context of classification and use tree properties to satisfy k-anonymization. Our work improves on previous approaches by increasing classification accuracy.
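
The k-anonymity requirement and the suppression step described in the abstract can be illustrated with a minimal Python sketch. This is not the kACTUS 2 algorithm (it involves no decision tree and no swapping); the function names, the `*` mask, and the toy records are assumptions made purely for illustration.

```python
from collections import Counter


def is_k_anonymous(records, quasi_identifiers, k):
    """Return True if every quasi-identifier combination present in
    `records` is shared by at least k records."""
    counts = Counter(tuple(r[a] for a in quasi_identifiers) for r in records)
    return all(c >= k for c in counts.values())


def suppress_small_groups(records, quasi_identifiers, k, mask="*"):
    """Illustrative suppression: replace the quasi-identifier values of
    records whose combination occurs fewer than k times with `mask`.
    The suppressed records then form one group; if that group is itself
    smaller than k, the result is still not k-anonymous."""
    counts = Counter(tuple(r[a] for a in quasi_identifiers) for r in records)
    released = []
    for r in records:
        key = tuple(r[a] for a in quasi_identifiers)
        row = dict(r)  # copy so the input records are not modified
        if counts[key] < k:
            for a in quasi_identifiers:
                row[a] = mask
        released.append(row)
    return released


# Hypothetical toy table: "age" and "zip" are quasi-identifiers,
# "diagnosis" is the attribute left untouched.
records = [
    {"age": 34, "zip": "10001", "diagnosis": "flu"},
    {"age": 34, "zip": "10001", "diagnosis": "cold"},
    {"age": 51, "zip": "10002", "diagnosis": "flu"},
    {"age": 47, "zip": "10003", "diagnosis": "asthma"},
]
qi = ["age", "zip"]

before = is_k_anonymous(records, qi, k=2)
released = suppress_small_groups(records, qi, k=2)
after = is_k_anonymous(released, qi, k=2)
```

In this toy table the last two records are unique on (`age`, `zip`), so the check fails for k = 2 until their quasi-identifier values are suppressed. Suppression keeps the remaining values truthful at the cost of information loss, which is the side of the tradeoff the abstract contrasts with swapping.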

Similar resources

Privacy-preserving data mining: A feature set partitioning approach

In privacy-preserving data mining (PPDM), a widely used method for achieving data mining goals while preserving privacy is based on k-anonymity. This method, which protects subject-specific sensitive data by anonymizing it before it is released for data mining, demands that every tuple in the released table should be indistinguishable from no fewer than k subjects. The most common approach for ...

Enhancing Informativeness in Data Publishing while Preserving Privacy using Coalitional Game Theory

k-Anonymity is one of the most popular conventional techniques for protecting the privacy of an individual. The shortcomings in the process of achieving k-Anonymity are presented and addressed by using Coalitional Game Theory (CGT) [1] and Concept Hierarchy Tree (CHT). The existing system considers information loss as a control parameter and provides anonymity level (k) as output. This paper pr...

A Survey of Privacy Preserving Data Publishing using Generalization and Suppression

Nowadays, information sharing as an indispensable part appears in our vision, bringing about a mass of discussions about methods and techniques of privacy preserving data publishing which are regarded as strong guarantee to avoid information disclosure and protect individuals’ privacy. Recent work focuses on proposing different anonymity algorithms for varying data publishing scenarios to satis...

A Novel Anonymity Algorithm for Privacy Preserving in Publishing Multiple Sensitive Attributes

Publishing the data with multiple sensitive attributes brings us greater challenge than publishing the data with single sensitive attribute in the area of privacy preserving. In this study, we propose a novel privacy preserving model based on k-anonymity called (α, β, k)-anonymity for databases. (α, β, k)-anonymity can be used to protect data with multiple sensitive attributes in data publishing...

Privacy Preserving Data Publishing Based on k-Anonymity by Categorization of Sensitive Values

In many organizations, large amounts of personal data are collected and analyzed by data miners for research purposes. However, the data collected may contain sensitive information which should be kept confidential. The study of privacy-preserving data publishing (PPDP) focuses on removing privacy threats while, at the same time, preserving useful information in the released data for data m...

Publication date: 2008